Task-based MT Evaluation: From Who/When/Where Extraction to Event Understanding
Abstract
Task-based machine translation (MT) evaluation asks: how well do people perform text-handling tasks given MT output? This method of evaluation yields an extrinsic assessment of an MT engine, in terms of users’ task performance on MT output. While this method is time-consuming, its key advantage is that MT users and stakeholders understand how to interpret the assessment results. Prior experiments showed that subjects can extract individual who-, when-, and where-type elements of information from MT output passages that were not especially fluent. This paper presents the results of a pilot study assessing a slightly more complex task: given such wh-items already identified in an MT output passage, how well can subjects select from and place these items into wh-typed slots to complete a sentence-template about the passage’s event? The results of the pilot with nearly sixty subjects, while only preliminary, indicate that this task was extremely challenging: given six test templates to complete, half of the subjects had no completely correct templates and 42% had exactly one completely correct template. The provisional interpretation of this pilot study is that event-based template completion defines a task ceiling against which to evaluate future improvements in MT engines.
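The template-completion task described above can be sketched as a toy scoring routine. This is a hypothetical illustration only: the slot names follow the paper's who/when/where typing, but the example items, template, and answer key are invented, and the "completely correct" criterion (all slots must match) is assumed from the abstract's wording.

```python
# Toy sketch of event-template completion scoring (invented data).
# Subjects place pre-identified wh-items into typed slots of a
# sentence-template; a template counts as completely correct only
# if every slot matches the answer key.

from typing import Dict, List

# wh-items already identified in an MT output passage (invented)
ITEMS = {
    "who": ["rebel forces", "the minister"],
    "when": ["on Tuesday", "last month"],
    "where": ["in the capital", "near the border"],
}

# Answer key for one template: "<who> attacked <where> <when>." (invented)
KEY = {"who": "rebel forces", "where": "near the border", "when": "on Tuesday"}


def template_correct(filled: Dict[str, str], key: Dict[str, str]) -> bool:
    """True only if every wh-typed slot matches the answer key."""
    return all(filled.get(slot) == answer for slot, answer in key.items())


def score_subjects(responses: List[List[Dict[str, str]]],
                   keys: List[Dict[str, str]]) -> List[int]:
    """Count of completely correct templates per subject."""
    return [sum(template_correct(f, k) for f, k in zip(resp, keys))
            for resp in responses]


# Two invented subjects completing the same single template:
subjects = [
    {"who": "rebel forces", "where": "near the border", "when": "on Tuesday"},
    {"who": "the minister", "where": "near the border", "when": "on Tuesday"},
]
print(score_subjects([[s] for s in subjects], [KEY]))  # -> [1, 0]
```

Under this all-or-nothing criterion, a single misplaced wh-item zeroes out the whole template, which is one plausible reading of why, with six templates, half the pilot subjects scored zero completely correct.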
Similar Articles
Task-based MT evaluation
Task-based machine translation (MT) evaluation asks, how well do people perform text-handling tasks given MT output? This method of evaluation yields an extrinsic assessment of an MT engine, in terms of users’ task performance on MT output. While this method is time-consuming, its key advantage is that MT users and stakeholders understand how to interpret the assessment results. Prior experimen...
Ground Truth, Reference Truth & “Omniscient Truth” -- Parallel Phrases in Parallel Texts for MT Evaluation
Recently introduced automated methods of evaluating machine translation (MT) systems require the construction of parallel corpora of source language (SL) texts with human reference translations in the target language (TL). We present a novel method of exploiting and augmenting these resources for task-based MT evaluation, assessing how accurately people can extract Who, When, and Where elements...
A Statistical Analysis of Automated MT Evaluation Metrics for Assessments in Task-Based MT Evaluation
This paper applies nonparametric statistical techniques to Machine Translation (MT) Evaluation using data from a large scale task-based study. In particular, the relationship between human task performance on an information extraction task with translated documents and well-known automated translation evaluation metric scores for those documents is studied. Findings from a correlation analysis ...
A Task-Oriented Evaluation Metric for Machine Translation
Evaluation remains an open and fundamental issue for machine translation (MT). The inherent subjectivity of any judgment about the quality of translation, whether human or machine, and the diversity of end uses and users of translated material, contribute to the difficulty of establishing relevant and efficient evaluation methods. The US Federal Intelligent Document Understanding Laboratory (FI...
An Investigation of the Relationship Between Automated Machine Translation Evaluation Metrics and User Performance on an Information Extraction Task
Doctoral dissertation by Calandra Rilette Tate (2007), directed by Professor Eric V. Slud, Department of Mathematics, and co-directed by Professor Bonnie J. Dorr, Department of Computer Science. This dissertation applies ...